LET: Linguistic Knowledge Enhanced Graph Transformer for Chinese Short Text Matching
Abstract
Chinese short text matching is a fundamental task in natural language processing. Existing approaches usually take Chinese characters or words as input tokens. They have two limitations: 1) some Chinese words are polysemous, and their semantic information is not fully utilized; 2) some models suffer from potential issues caused by word segmentation. Here we introduce HowNet as an external knowledge base and propose a Linguistic knowledge Enhanced graph Transformer (LET) to deal with word ambiguity. Additionally, we adopt the word lattice graph as input to maintain multi-granularity information. Our model is also complementary to pre-trained language models. Experimental results on Chinese datasets show that our models outperform various typical text matching approaches. An ablation study indicates that both semantic information and multi-granularity information are important for text matching modeling.
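To make the word lattice idea concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the function name `build_word_lattice`, the toy vocabulary, and the `max_len` cutoff are all hypothetical. Every single character and every in-vocabulary substring becomes a node, and an edge links two nodes when one ends where the other begins, so competing segmentations coexist in one graph.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int, str]  # (start offset, end offset, token text)


def build_word_lattice(
    sentence: str, vocab: Set[str], max_len: int = 4
) -> Tuple[List[Span], List[Tuple[int, int]]]:
    """Return lattice nodes and directed edges (as pairs of node indices)."""
    nodes: List[Span] = []
    for start in range(len(sentence)):
        last_end = min(len(sentence), start + max_len)
        for end in range(start + 1, last_end + 1):
            token = sentence[start:end]
            # Keep every single character, plus any longer vocabulary word.
            if end - start == 1 or token in vocab:
                nodes.append((start, end, token))
    # Connect node i to node j when i ends exactly where j starts.
    edges = [
        (i, j)
        for i, a in enumerate(nodes)
        for j, b in enumerate(nodes)
        if a[1] == b[0]
    ]
    return nodes, edges


# Hypothetical toy vocabulary; a real system would use a segmenter's lexicon.
toy_vocab = {"南京", "南京市", "市长", "长江", "大桥", "长江大桥"}
nodes, edges = build_word_lattice("南京市长江大桥", toy_vocab)
for start, end, token in nodes:
    print(start, end, token)
```

In this toy sentence, both "南京市" and "市长" appear as nodes even though they overlap, which is the sense in which a lattice keeps multi-granularity information instead of committing to one segmentation.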
Similar resources
Text-Enhanced Representation Learning for Knowledge Graph
Learning the representations of a knowledge graph has attracted significant research interest in the field of intelligent Web. By regarding each relation as one translation from head entity to tail entity, translation-based methods including TransE, TransH and TransR are simple, effective and achieving the state-of-the-art performance. However, they still suffer the following issues: (i) low pe...
Knowledge Enhanced Hybrid Neural Network for Text Matching
Long text brings a big challenge to semantic matching due to their complicated semantic and syntactic structures. To tackle the challenge, we consider using prior knowledge to help identify useful information and filter out noise to matching in long text. To this end, we propose a knowledge enhanced hybrid neural network (KEHNN). The model fuses prior knowledge into word representations by know...
Linguistic knowledge for specialized text production
This paper outlines a proposal for encoding and describing verb phrase constructions in the knowledge base on the environment EcoLexicon, with the objective of helping translators in specialized text production. In order to be able to propose our own template, the characteristics and limitations of the most representative terminographic resources that include phraseological information were ana...
Chinese Short Text Classification Based on Domain Knowledge
People are generating more and more short texts. There is an urgent demand to classify short texts into different domains. Due to the shortness and sparseness of short texts, conventional methods based on Vector Space Model (VSM) have limitations. To tackle the data scarcity problem, we propose a new model to directly measure the correlation between a short text instance and a domain instead of...
Text Classification Using Graph-Encoded Linguistic Elements
Inspired by the goal to more accurately classify text, we describe an effort to map tokens and their characteristic linguistic elements into a graph and use that expressive representation to classify text phrases. We outperform the bag-of-words approach by exploiting word order and the semantic and syntactic characteristics within the phrases. In this study, we map tagged corpora into a placehol...
Journal
Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence
Year: 2021
ISSN: 2159-5399, 2374-3468
DOI: https://doi.org/10.1609/aaai.v35i15.17592